Abstract: Building performance discrepancies between building
design and operation are one of the causes that lead many new
designs to fail to achieve their goals and objectives. One of the main
factors contributing to the discrepancy is occupant behavior.
Occupants responding to a new design are influenced by several
factors. Existing building performance models (BPMs) ignore or
only partially address those factors (called contextual factors).
To potentially reduce the discrepancies and improve the prediction
accuracy of BPMs, this paper proposes a computational framework
for learning mixture models by using Generative Adversarial
Networks (GANs) that appropriately combine existing BPMs with
knowledge of occupant behaviors in response to contextual factors
in new designs. Immersive virtual environment (IVE) experiments
are used to acquire data on such behaviors. Performance targets
are used to guide the appropriate combination of existing BPMs
with knowledge of occupant behaviors. The resulting model is
called an augmented BPM. Two experiments related to occupants'
lighting behaviors are presented as a case study. The results reveal
that augmented BPMs significantly outperformed existing BPMs with
respect to achieving specified performance targets. The case
study confirms the potential of the computational framework
for improving the prediction accuracy of BPMs during design.

Index Terms —occupant behavior, mixture model, building 
performance model, generative adversarial network, immersive 
virtual reality 

I. Introduction 

Building designs define the characteristics, functions, and
contexts of buildings according to the objectives and goals of a
building project. Building performance is an important aspect of
design that needs designers' attention. It reflects how well
buildings perform with regard to many components such as energy,
occupant comfort, and control systems. Building performance models
(BPMs) are tools that support designers in investigating,
predicting, and understanding the performance of buildings and in
making decisions during design. Several BPMs are used to optimize
building performance during design, e.g., BPMs for predicting
energy consumption (electricity consumption), BPMs for predicting
building performance (heat loss and air quality), and BPMs for
predicting occupants' interactions with building components (light
switches, blinds, and windows). For instance, designers use
lighting BPMs to estimate occupants' light switch behaviors.
Empirical evidence has shown the existence of significant
performance discrepancies between predictions during design and
the actual performance of building operations [1], [2]. The
performance discrepancies may contribute to undesired building
performance such as unexpected energy consumption, building
degradation, and occupant discomfort. Many factors may contribute
to the discrepancies. Occupant behaviors are one of the crucial
contributing factors since they are uncertain, complex, and
difficult to understand and model [3]. Moreover, they may be
influenced by many factors such as one's sense of control,
building characteristics, building services systems and
operations, and climate, which makes it challenging to accurately
capture them while developing BPMs [4].

Most BPMs are developed mathematically by finding the
relationships between dependent and independent variables of
interest. Generally, traditional methods, namely questionnaires
[5], [6] and field studies [7], [8], are used to collect occupant
behavior data (dependent variables) with respect to environmental
factors (independent variables). For instance, Hunt [9] used a
field study to observe occupants' lighting behaviors in an existing
building for almost a year. He used minimum working area
illuminance as a predictor of whether occupants switch the light
on. The main advantage of using traditional methods to acquire
data on occupant behaviors is that a large pool of continuous data
can be obtained, which is suitable for developing BPMs. However,
the capabilities of traditional research methods for studying
occupant behavior are limited in many aspects. First, such data
only represent occupant behaviors in existing buildings. Contexts
of existing buildings may differ from those of new designs, which
may influence occupant behaviors differently. Second, since the
data on occupant behaviors are obtained from existing buildings,
some factors that influence occupant behaviors in new designs
(such as contextual factors) may not be captured. Contextual
factors are generally ignored or only partially addressed in
existing BPMs. These limitations reduce the predictive capability
of existing BPMs, which in turn gives rise to performance gaps
between predictions made during design and actual buildings.



IVEs can be alternative tools to support occupant behavior
data collection. They are rich multisensory computer simulations
that can mentally immerse users in the simulations. IVEs
have been used in several research areas such as emergency
situations [10], [11], driving behaviors [12], [13], and building
designs [14], [15]. IVEs have been shown to be capable of
simulating physical environments, providing a sense of reality,
and capturing users' responses.

The paper proposes a computational framework to reduce the
performance discrepancy between predictions made during
design and actual building operation by combining knowledge
about occupant behaviors in response to contextual and
design-specific factors of new buildings with existing BPMs. IVEs
are used as tools to capture occupant behaviors. The framework
uses Generative Adversarial Networks (GANs) to learn
mixture models that appropriately combine existing
BPMs with knowledge of occupant behaviors obtained from
IVEs to produce augmented BPMs with improved predictive
power. Performance targets are used as guides to achieve an
appropriate combination. The computational framework offers
a novel approach for improving the prediction accuracy of
BPMs during design and reducing the performance discrepancy
between predictions and actual operation.

The contribution of this paper is:

• We offer a computational framework to combine existing
BPMs with knowledge of occupant behaviors in response
to contextual factors of new building designs obtained
from IVE experiments. The work contributes to the
development of a novel approach for minimizing the
discrepancy in building performance between predictions
during design and the actual performance during
building operation, thus allowing improved future
building designs.

II. Related Work 
A. Building Performance Models (BPMs) 

Considerable research has been devoted to developing
techniques for creating BPMs. Examples of how researchers
develop and use BPMs are summarized as follows.

Hunt [9] developed a BPM for predicting manual lighting
control based on a switch-on probability and minimum working
area illuminance. The BPM was developed by using field
study data where sensors were installed in experimental offices
to capture occupant interaction with artificial light switches.
The BPM was expressed in terms of a statistical Probit model.
Likewise, Nicol [16] developed BPMs from survey data to predict
occupants' use of windows, lighting, blinds, heaters, and fans
based on outdoor temperature in naturally ventilated buildings.
Probit analysis was used to determine the relationship
between occupants' use of these building components and outdoor
temperature. Newsham [17] improved the computer-based
thermal model FENESTRA by providing an algorithm to
describe manual blind operation in relation to light switching
as described by Hunt's model. From the results of his model, he
suggested that occupant behavior may significantly influence
predictions of thermal energy consumption. Reinhart [18]
proposed an algorithm called Lightswitch-2002 to determine the
electric lighting energy demand of light switches. It was
integrated into many simulation programs, such as a design
support tool (Lightswitch Wizard [19]) and a whole building
energy simulation tool (ESP-R [20]). The algorithm included
an occupancy model, which considered profiles of occupants
and minimum working area illuminance similar to Hunt's
approach, and a dynamic daylight simulation to predict electric
lighting demand. The algorithm considered the daytime switch-on
probability in addition to the probability of switching the light
on upon arrival. Similarly, Gunay et al. [21] formulated BPMs for
an adaptive lighting and blinds control algorithm. Their BPMs
include concurrent solar irradiance as an additional predictor
of occupant lighting preferences, besides the minimum working
area illuminance and intermediate occupancy used in other works.

Traditionally, BPMs are developed based on data acquired
using occupant behavior study approaches, namely questionnaires
and field studies. Most of the existing BPMs take the
form of a correlation between independent variables
(environmental and building design-specific factors) and a
dependent variable (occupant behaviors). Researchers express the
relationships by using statistical modelling such as regression
models [9], [22].

B. Occupant Behavior Research Methods 

Questionnaires are a common method to study occupant
behaviors. Questionnaires can be directed to the subjects that
researchers wish to investigate. They can also handle
large-scale experiments. For example, Attia et al. [5] used
questionnaires to collect occupant behavior data related to
household device usage in residential apartments in various areas
of Egypt. They applied the data obtained from the questionnaires
to construct benchmarks for building energy simulations. Feng
et al. [6] used questionnaires to observe occupant behaviors
related to air conditioning (AC) patterns. The information
acquired from the questionnaires was used to categorize
occupants' behaviors in switching the AC on and off.
Questionnaires are used in research on multiple aspects of
interest in several places simultaneously. For instance, Nicol
[16] studied occupant behavior regarding the use of windows,
lighting, blinds, heating, and fans by using questionnaires in
the UK, Pakistan, and Europe. Even though questionnaires provide
various advantages, an important disadvantage is that they are
not able to quantitatively capture the relationship between the
contexts and the occupant behaviors.

The field monitoring method has been used in many studies
of, for example, light switching [23], window opening [24],
energy usage for space and water heating [7], occupants'
heating set-points [8], occupant interactions with shading and
lighting [25], and occupants' use of plug-in equipment [26]. One
of the advantages of this method is that the collected data
are continuous and large samples can be acquired, since
multiple sensors are deployed over long periods of time.
Another advantage is that the method is capable of providing
quantitative relationships between the occupant behaviors and
the contexts. However, this method has many limitations,
including: (1) data are normally collected at time intervals,
e.g., every 30 minutes, and critical events that occur between
measurements may be missed; (2) other equipment may
interfere with the sensors and distort information about occupant
behaviors and contexts; and (3) many assumptions about
occupant behaviors and design contexts, such as occupant
schedules, the variables that drive behaviors, and the purposes
of occupant responses to building systems, have to be made to
derive the BPMs.

C. Immersive Virtual Environment (IVE) 

The traditional occupant behavior research methods
described above typically rely on observations of occupants
in existing buildings. Since occupant behaviors are context
sensitive, findings from such observations can contain
biases and uncertainties. Thus, applying such findings to new
designs may lead to significant variances in predictions. We
suggest an alternative method to study and observe occupant
behaviors during building design by using immersive virtual
environments (IVEs). There are several reasons why
IVEs are good candidates for studying and observing occupant
behaviors in buildings. For instance, IVEs allow researchers to
control confounding variables and isolate variables of interest,
allow participants to be immersed in the settings, and keep
variables of interest constant while experiments are conducted
[27]. Previous works that show the abilities of IVEs as
alternative tools to study occupant behaviors are summarized as
follows.

In studies related to human behavior, Heydarian et al. [27]
used IVEs to study occupant behaviors related to lighting and
shade usage. Saeidi et al. [28] evaluated data on occupants'
lighting behaviors acquired from IVEs and showed that IVEs
were capable of replicating experiences in physical
environments. A framework for integrating building designs with
IVEs was also developed by Niu et al. [15]. The purpose of the
framework was to help building designers capture occupant
preferences and identify context patterns. They concluded that
integrating building designs with IVEs using their framework
helped designers to understand occupant behavior and identify
design contexts that guide occupants to act in accordance with
design intentions. In another work, Saeidi et al. [29] conducted
an experiment to study occupants' lighting preferences in IVEs
and compared the resulting data with data collected
from physical sensors. They found good agreement between
the occupants' preferences in IVEs and those in actual physical
environments.

D. Generative Adversarial Networks (GANs)

Deep learning has grown in popularity in recent years
[30]-[34]. Generative Adversarial Networks (GANs) were proposed
in [31]. GANs have been successfully used in various domains
[35], especially image synthesis.

Ledig et al. [36] used GANs to learn and recover
photo-realistic textures from downsampled images. They proposed
super-resolution GANs (SRGANs) that can estimate photo-realistic
super-resolution images with high upscaling factors.
Radford et al. [37] introduced deep convolutional generative
adversarial networks (DCGANs) for generating realistic
and high resolution images. They showed that DCGANs
outperformed other unsupervised algorithms (K-means, Random
Forest (RF), and Transductive Support Vector Machines
(TSVMs)). Wang and Gupta [38] introduced Style and Structure
Generative Adversarial Networks (S2-GANs). S2-GANs
address structure and style in the image generation process.
S2-GANs are able to produce more realistic high-resolution
images, in addition to having a more robust and stable training
method compared to standard GANs. Apart from 2D image
generation, Wu et al. [39] introduced 3D-GANs that are
capable of generating 3D objects by combining volumetric
convolutional networks with GANs.

The previous works demonstrate the ability of GANs
to produce synthetic images that are close to real images from
arbitrary inputs (noise). Correspondingly, we use GANs
to produce augmented BPMs whose predictions are close to the
performance targets by combining existing BPMs with knowledge of
occupant behaviors in response to contextual factors in new
building designs.

III. Framework of Mixture Model 
Fig. 1. Framework of proposed mixture model.


Because existing BPMs cannot accurately model occupant
behaviors in new designs, we propose a framework that enhances
BPMs by appropriately combining existing BPMs with knowledge of
occupant behaviors in new designs obtained from IVE experiments
(IVE datasets). There are four major components in the framework,
namely an existing BPM, occupant responses in a new design,
a performance target, and Generative Adversarial Networks
(GANs). An existing BPM is a BPM that is constructed
from occupant behavior data in an existing building. Occupant
responses in a new design are occupant behavior data
obtained from an IVE experiment, which exposes the occupant
to an environment of a new design and considers factors
that are ignored in an existing BPM (contextual factors). A
performance target, such as building benchmarks, historically
desired occupant data, or desired building performance, is used
as a guide for combining an existing BPM and occupant responses
in a new design. GANs [31] are used to create mixture
models that allow an appropriate combination of an existing BPM
and knowledge of occupant behaviors obtained from IVE
experiments, guided by a performance target (Fig. 1). In the
framework, we define the dataset obtained by drawing independent
and identically distributed (IID) samples from an existing BPM
as the existing BPM dataset. GANs comprise two major parts: a
generator and a discriminator. The generator is an artificial
neural network (ANN) that takes the existing BPM dataset and the
IVE dataset as input and outputs a mixture distribution of the
two datasets (called the augmented BPM).
The performance predicted from the resulting mixture
distribution is intended to be as close as possible to the given
performance targets. The discriminator is an ANN that tries
to discriminate between the performance predictions obtained
from the mixture distribution produced by the generator and the
performance target. During training, the generator and the
discriminator play a minimax game with each other, in which
the generator tries to produce a mixture distribution that
meets the performance targets and the discriminator tries
to determine whether the generator meets the performance targets.
Training continues until a defined convergence criterion
is reached (a maximum number of iterations, or a discrepancy
between the predictions of the generator and the targets that is
below a threshold). Once training converges, the resulting
generator is the augmented BPM.

IV. Case Study 

A. Existing Building Performance Models and Targets 

Two experiments related to occupant light switching
behaviors are conducted. In the first experiment, the model for
predicting occupant light switching behaviors developed by Hunt
[9] is used as the existing BPM. For the performance target, we
use the probabilities of switching on provided by the probit
model described in [22]. In the second experiment, the existing
BPM is the model for predicting occupant light switching
behaviors developed by Gunay et al. [21]. The performance target
is similar to the performance target in the first experiment. In
the existing BPMs, Probit regression is used to represent the
relationship between the probability of an occupant switching
the light on and the work area illuminance as shown below:

$$p = a + \frac{c}{1 + \exp(-(d m + b E_{lux}))} \qquad (1)$$

where:

$p$ = probability of switching the light on,

$E_{lux}$ = the working area illuminance (lux),

$a, b, c, d, m$ = constants given in TABLE I.

Independent and identically distributed (IID) samples of the
existing BPMs and the performance targets are generated by
using Monte Carlo simulations. Values of $E_{lux}$ are randomly
sampled from a normal distribution. The values are taken as
inputs to compute outputs (probability of switching on ($p$)) by
using (1). The values of $p$ and $E_{lux}$ are used in the
computational framework.

TABLE I. Existing BPMs and Performance Targets

                      Experiment 1             Experiment 2
Existing BPM          a = -0.0175              a = 0
                      b = -4.0835              b = -0.005
                      c = 1.0361               c = 1
                      d = -4.0835              d = 1
                      m = -1.8223              m = -0.170
                      E_lux = log10(lux)       E_lux = lux

Performance Target    a = 0                    a = 0
                      b = -0.003               b = -0.003
                      c = 1                    c = 1
                      d = 1                    d = 1
                      m = 2.035                m = 2.035
                      E_lux = log10(lux)       E_lux = lux
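As a rough illustration of how the existing BPM and target datasets could be generated, the sketch below evaluates (1) with constants from TABLE I and draws illuminance values from a normal distribution. The mean and standard deviation of that distribution, the sample size, the lower clip of 1 lux, and all function and variable names are assumptions made for illustration; the paper does not state them.

```python
import math
import random

# Constants (a, b, c, d, m) copied from TABLE I, Experiment 1.
EXISTING_BPM_EXP1 = dict(a=-0.0175, b=-4.0835, c=1.0361, d=-4.0835, m=-1.8223)
TARGET_EXP1 = dict(a=0.0, b=-0.003, c=1.0, d=1.0, m=2.035)


def switch_on_probability(e_lux, a, b, c, d, m):
    """Evaluate (1): p = a + c / (1 + exp(-(d*m + b*e_lux)))."""
    return a + c / (1.0 + math.exp(-(d * m + b * e_lux)))


def sample_dataset(params, n, mean_lux=500.0, std_lux=150.0,
                   log10_input=False, seed=0):
    """Monte Carlo sampling of (E_lux, p) pairs.

    mean_lux, std_lux and the clip at 1 lux are assumed values.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        lux = max(1.0, rng.gauss(mean_lux, std_lux))
        e_lux = math.log10(lux) if log10_input else lux
        rows.append((e_lux, switch_on_probability(e_lux, **params)))
    return rows


existing_bpm_data = sample_dataset(EXISTING_BPM_EXP1, n=1000, log10_input=True)
target_data = sample_dataset(TARGET_EXP1, n=1000, log10_input=True)
```

With the Experiment 1 constants of the existing BPM, the exponent in (1) vanishes at $E_{lux} = 1.8223$, so the probability there is $a + c/2$, which is a convenient sanity check on the reconstruction of (1).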

B. Occupant Light Switching Behaviors in New Design 



Fig. 2. The Virtual Single-Occupancy Office. 


Data on occupant behaviors in new designs are retrieved
from a previous study [29]. Saeidi et al. [29] used an IVE to
study occupant light switching behaviors in a virtual
single-occupancy office as shown in Fig. 2. The IVE experiments
were set up by manipulating critical events of the data obtained
from the physical environment (e.g., arrival at the office,
intermediate leaving, coming back from intermediate leaving,
and departure; see TABLE II). Each event includes values of
contextual factor variables (e.g., indoor and outdoor
illuminance, intermediate leaving status, and occupancy status)
in new-design buildings. The occupant was exposed to the
contextual factors (see TABLE II) in event-based experiments. The
occupant's interactions with the light switch were captured.
For instance, the occupant switched the light on when both the
indoor and outdoor environments were dark. A total of 180 data
points relating to occupant preferences (lighting) and values of
contextual factor variables (indoor and outdoor illuminance,
intermediate leaving status, and occupancy status) were acquired
from the IVE experiments: 36 initial events before arrival at
the office, 36 events of arrival at the office, 18 events of
intermediate short leave, 18 events of returning from the
intermediate short leave, 18 events of intermediate long leave,
18 events of returning from the intermediate long leave, and 36
events of departure.

Because the sample size of the IVE data is small and the
experiment consists of sequences of events, data augmentation is
performed. A Hidden Markov Model (HMM) is trained with the
Baum-Welch algorithm [40] on the data obtained from the IVE
experiment and is then used to generate IID synthetic samples.

In the HMM, the hidden states and the observations of
events are classified. The status of the light switch is
classified as the hidden state. The statuses of the other
variables, namely occupancy status, intermediate leaving, outdoor
illuminance, and working area illuminance, are classified as
observations. Each observation vector is encoded as an ordinal
variable by combining the statuses of the factors. For instance,
non-occupancy, short intermediate leaving, bright outdoor
illuminance, and bright work area illuminance are combined as
no + short + bright + bright and encoded by a single value such
as 1. The transition and observation probabilities are calculated
based on the obtained IVE data. The HMM learns the relationship
between the hidden states and observations from the transition
and observation probabilities. After the training process
finishes, the IID synthetic sequences of events and observations
(the IID synthetic IVE dataset) are randomly synthesized through
the trained HMM [41].
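The encoding and sampling steps described above can be sketched as follows. This is only an illustration under stated assumptions: the status labels, the integer assigned to each combination, the function name `sample_hmm`, and the numeric start, transition, and emission probabilities are all hypothetical placeholders rather than values fitted to the IVE data, and the Baum-Welch training step itself is not shown.

```python
import itertools
import random

OCCUPANCY = ["no", "yes"]
LEAVING = ["none", "short", "long"]
OUTDOOR = ["dark", "normal", "bright"]
WORK_AREA = ["dark", "normal", "bright"]

# Map every combination of statuses to one integer code (2*3*3*3 = 54 codes).
OBS_CODE = {combo: i for i, combo in
            enumerate(itertools.product(OCCUPANCY, LEAVING, OUTDOOR, WORK_AREA))}


def sample_hmm(start, trans, emit, length, rng):
    """Draw one sequence of (hidden state, observation code) pairs.

    start[s]    : probability of starting in hidden state s
    trans[s][t] : probability of moving from state s to state t
    emit[s][o]  : probability of emitting observation code o in state s
    """
    states = range(len(start))
    codes = range(len(emit[0]))
    sequence = []
    s = rng.choices(states, weights=start)[0]
    for _ in range(length):
        o = rng.choices(codes, weights=emit[s])[0]
        sequence.append((s, o))
        s = rng.choices(states, weights=trans[s])[0]
    return sequence


# Hidden states: 0 = light off, 1 = light on. All numbers are placeholders.
rng = random.Random(0)
n_codes = len(OBS_CODE)
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.3, 0.7]]
emit = [[1.0 / n_codes] * n_codes for _ in range(2)]
synthetic = sample_hmm(start, trans, emit, length=7, rng=rng)
```

In practice the three probability tables would come from the trained HMM, and each sampled observation code would be decoded back into its four statuses through the inverse of `OBS_CODE`.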

TABLE II. Statuses of Factors

Contextual Factor        Status
Occupancy                Non-Occupancy
                         Occupancy
Outdoor Illuminance      Dark (200 Lux)
                         Normal (500 Lux)
                         Bright (700 Lux)
Intermediate Leaving     None
                         Short leaving
                         Long leaving

Independent Variable     Status
Work Area Illuminance    Dark (200 Lux)
                         Normal (500 Lux)
                         Bright (700 Lux)

C. Generative Adversarial Network (GAN) 

1) Data Organization: Since the existing BPM and the
target datasets have only working area illuminance as an
independent variable, the missing data for contextual factors
in the existing BPM and the target datasets, e.g., occupancy
and intermediate leaving statuses, are randomly generated
from those of the synthetic IVE dataset. For instance, since
the occupancy status in the synthetic IVE dataset includes
non-occupancy and occupancy, the occupancy data in the existing
BPM dataset are randomly generated as non-occupancy
or occupancy. Corresponding to the status of intermediate
leaving in the synthetic IVE dataset, the data for intermediate
leaving in the existing BPM are randomly generated as none,
short, or long leaving.
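A minimal sketch of this data organization step is given below. The dictionary keys, the function name `fill_contextual_factors`, the example rows, and the choice of uniform sampling over the statuses present in the synthetic IVE dataset are assumptions for illustration; the paper only states that the missing statuses are randomly generated.

```python
import random


def fill_contextual_factors(bpm_rows, ive_rows, rng):
    """Attach occupancy and intermediate-leaving statuses to BPM rows.

    bpm_rows: list of dicts with keys 'work_lux' and 'p_on'
    ive_rows: list of dicts that also hold 'occupancy' and 'leaving'
    Statuses are drawn uniformly from the values present in ive_rows.
    """
    occupancy_values = sorted({row["occupancy"] for row in ive_rows})
    leaving_values = sorted({row["leaving"] for row in ive_rows})
    filled = []
    for row in bpm_rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        new_row["occupancy"] = rng.choice(occupancy_values)
        new_row["leaving"] = rng.choice(leaving_values)
        filled.append(new_row)
    return filled


# Made-up example rows, not data from the paper.
bpm_rows = [{"work_lux": 300.0, "p_on": 0.7}, {"work_lux": 650.0, "p_on": 0.2}]
ive_rows = [
    {"occupancy": "occupancy", "leaving": "none", "work_lux": 200.0, "p_on": 1.0},
    {"occupancy": "non-occupancy", "leaving": "short", "work_lux": 700.0, "p_on": 0.0},
    {"occupancy": "occupancy", "leaving": "long", "work_lux": 500.0, "p_on": 0.0},
]
filled_rows = fill_contextual_factors(bpm_rows, ive_rows, random.Random(0))
```

After this step the existing BPM rows carry the same feature columns as the synthetic IVE rows, so the two datasets can be concatenated.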

2) Computation: In both experiments, we provide the
generator (G) with the existing BPM dataset and the synthetic IVE
dataset (z) as input. The existing BPM and the synthetic
IVE datasets are combined by concatenation. The generator
is an ANN consisting of a three-layer perceptron network
including an input, two hidden, and an output layer. The
inputs in the input layer are the occupancy status, intermediate
leaving, and working area illuminance. The output in the
output layer is the probability of switching the light on. The
hidden layers of the network comprise 300 hidden neurons
each, with the rectified linear unit (ReLU) activation function,
since it has been shown to have better fitting ability than the
sigmoid function in similar applications [42]. To prevent
overfitting, elastic net regularization (a combination of L1
(Laplacian) and L2 (Gaussian) penalties) is used [43]. The
sigmoid activation function is applied at the output neuron
because the outputs are probabilities. The loss function of the
model is binary cross entropy (logistic regression). The learning
rate and the regularization parameter are both $10^{-6}$.
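The generator architecture described above can be sketched in plain Python as a forward pass only. The weight initialization, the numeric encoding and scaling of the three inputs, the equal weighting of the L1 and L2 penalties, and all function names are assumptions; training by backpropagation is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(a):
    # Numerically stable form for large negative inputs.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def generator_forward(x, layers):
    """x = [occupancy, leaving, work-area illuminance] -> switch-on probability."""
    h = relu(dense(x, layers[0]))
    h = relu(dense(h, layers[1]))
    return sigmoid(dense(h, layers[2])[0])


def elastic_net_penalty(layers, r=1e-6):
    """r * (sum |w| + sum w^2); equal L1/L2 weighting is an assumption."""
    flat = [w for weights, _ in layers for row in weights for w in row]
    return r * (sum(abs(w) for w in flat) + sum(w * w for w in flat))


def binary_cross_entropy(p, y, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


rng = random.Random(0)
layers = [init_layer(3, 300, rng), init_layer(300, 300, rng),
          init_layer(300, 1, rng)]
p_on = generator_forward([1.0, 0.0, 0.5], layers)  # inputs scaled to [0, 1]
```

A deep learning library would normally replace these hand-written layers; the pure-Python form is used here only to keep the example self-contained.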

The discriminator (D) is an ANN used to discriminate between
the outputs of the generator and the performance targets.
The discriminator comprises a three-layer ANN including
an input, two hidden, and an output layer. The setup of
the discriminator is similar to that of the generator except that
the activation functions at the hidden layers are Leaky ReLUs.
Two datasets, i.e., the output of the generator and the targets,
are combined by concatenation. The labels of the two datasets
are defined as 0 (the output of the generator) and 1 (the
performance targets).

Based on [31], to learn a generator distribution $p_g$ over
the performance target ($x$), the generator builds a mapping
function from the distribution $p_z(z)$ of the combined existing
BPM dataset and synthetic IVE dataset to the data space,
$G(z; \theta_g)$. The discriminator $D(x; \theta_d)$
outputs the probability that $x$ came from the performance
target distribution ($p_{data}$) rather than from $p_g$. Following
[31], we train G and D together using backpropagation: D is
trained to maximize $\log D(x) + \log(1 - D(G(z)))$, while G is
trained to minimize $\log(1 - D(G(z)))$. This is equivalent to
playing a minimax game between G and D with the value function
$V(D, G)$. The combination of the two datasets (generator outputs
and performance targets) is used as the input of the
discriminator, and the labels are used as its outputs.

If we use traditional GANs, the discriminator is confronted
with the difficulty of accurately discriminating between the
outputs of the generator and the targets, since there is only one
feature (the probability of switching the light on) as the input
to the discriminator. To solve the problem, we partially adapt the
concept of conditional GANs [44] by using the input features of
the generator (occupancy status, intermediate leaving status, and
working area illuminance) as additional
inputs to the discriminator model. The GAN scheme of the
computational framework is shown in Fig. 3. With this
conditioning, the value function $V(D, G)$ becomes [31], [44]:





Fig. 3. Scheme of GAN of The Case Study. 


$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x|z)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \qquad (2)$$

For clarity, we summarize the corresponding pseudo-code 
of the optimization algorithm of the computational framework 
in Algorithm 1 [31], [44]. 
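To make the value function concrete, the sketch below computes an empirical estimate of $V(D, G)$ from discriminator outputs on a batch of target samples and a batch of generator samples. The function names and the clipping constant `eps` are assumptions introduced for illustration.

```python
import math


def value_function(d_real, d_fake, eps=1e-12):
    """Empirical estimate of V(D, G).

    d_real: discriminator outputs D(x|z) on performance-target samples
    d_fake: discriminator outputs D(G(z)) on generator samples
    """
    real_term = sum(math.log(max(d, eps)) for d in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - d, eps)) for d in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_objective(d_fake, eps=1e-12):
    """The term of V(D, G) that depends on G, which the generator minimizes."""
    return sum(math.log(max(1.0 - d, eps)) for d in d_fake) / len(d_fake)


# A discriminator that cannot tell the two sources apart outputs 0.5
# everywhere, which gives V = -log 4 as in the original GAN analysis [31].
v_undecided = value_function([0.5, 0.5], [0.5, 0.5])
```

The discriminator update increases this quantity, while the generator update decreases its second term, which is the alternating scheme summarized in Algorithm 1.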

V. Results 

A. Comparisons of Performance of BPMs

The probabilities of switching on are randomly sampled
from the augmented BPMs, the existing BPMs, the synthetic IVE
datasets, and the performance target. The probabilities obtained
from the three sources (augmented BPM, existing BPM, and
synthetic IVE dataset) are each compared against the performance
target. The mean absolute error (MAE) is used to measure the
performance of each of them against the targets by using (5).

$$MAE = \frac{\sum_{i=1}^{n} |y_i - x_i|}{n} \qquad (5)$$

where:

$i$ = ranges over the list of data points, i.e., work area
illuminances ($i = 1, 2, 3, \ldots, n$),

$y_i$ = refers to the probability of switching on at data
point $i$ as specified by the performance targets,

$x_i$ = refers to the probability of switching on at data
point $i$ of the augmented BPM (resp. existing BPM,
resp. synthetic IVE dataset).
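Equation (5) translates directly into a few lines of code. The function name and the example numbers below are made up for illustration and are not data from the experiments.

```python
def mean_absolute_error(targets, predictions):
    """MAE as in (5): the mean of |y_i - x_i| over all data points."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum(abs(y - x) for y, x in zip(targets, predictions)) / len(targets)


# Toy example: target probabilities y_i and model probabilities x_i.
y = [0.9, 0.5, 0.1]
x = [0.8, 0.7, 0.1]
mae = mean_absolute_error(y, x)  # approximately 0.1
```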

The results of the experiments are plotted in Fig. 4a and
Fig. 4b to visually distinguish the performance of the BPMs.
The MAEs are shown in TABLE III. TABLE III shows that, in both
experiments, the MAE between the probabilities of switching on
predicted by the augmented BPM and those specified by the
performance target is smaller than the MAE of either the existing
BPM or the synthetic IVE data. The results can be interpreted as
evidence that the augmented BPMs outperform both the existing
BPMs and the IVE data.

Algorithm 1 The Optimization of The Framework. All experiments
in the paper used the default values $\alpha = r = 10^{-6}$,
$m = 2000$, $n = 2e5$.

Require: $\alpha$, the learning rate; $r$, the regularization;
$m$, the batch size; $n$, the number of epochs.

1: for n epochs do

2:   Train the discriminator

3:   Sample a batch of 2m samples, $(z^{(1)}, \ldots, z^{(2m)})$,
     from the generator distribution $p_g(z)$. To provide the
     additional inputs of the discriminator, the samples (z)
     include the inputs of the generator.

4:   Sample a batch of 2m samples from the performance
     target distribution $p_{targets}(x)$.

5:   Train the discriminator by using backpropagation with
     stochastic gradient ascent [31], [44]:

     $$\nabla_{\theta_d} \frac{1}{2m} \sum_{i=1}^{2m} \left[ \log D(x^{(i)} | z^{(i)}) + \log(1 - D(G(z^{(i)}))) \right] \qquad (3)$$

6:   Train the generator

7:   Sample a batch of m samples from the existing BPM
     dataset.

8:   Sample a batch of m samples from the synthetic IVE
     dataset.

9:   Combine the samples of the existing BPM dataset and the
     IVE dataset by concatenating, $(z^{(1)}, \ldots, z^{(2m)})$.

10:  Train the generator by using backpropagation with
     stochastic gradient descent [31], [44]:

     $$\nabla_{\theta_g} \frac{1}{2m} \sum_{i=1}^{2m} \log(1 - D(G(z^{(i)}))) \qquad (4)$$

11: end for
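The control flow of Algorithm 1 can be sketched as a loop that alternates a discriminator update and a generator update. This is one possible reading of the algorithm, not the authors' implementation: the three callables `generate`, `update_discriminator`, and `update_generator` form a hypothetical interface supplied by the caller, sampling with replacement is an assumption, and the gradient steps themselves are left to those callables.

```python
import random


def train(n_epochs, m, bpm_data, ive_data, target_data,
          generate, update_discriminator, update_generator, rng):
    """Alternate discriminator and generator updates as in Algorithm 1."""
    for _ in range(n_epochs):
        # Steps 3-5: 2m generated samples (kept together with their inputs
        # so the discriminator can be conditioned on them) and 2m targets.
        z = rng.choices(bpm_data, k=m) + rng.choices(ive_data, k=m)
        fake = [(zi, generate(zi)) for zi in z]
        real = rng.choices(target_data, k=2 * m)
        update_discriminator(real, fake)

        # Steps 7-10: m BPM samples and m IVE samples, concatenated to 2m.
        z = rng.choices(bpm_data, k=m) + rng.choices(ive_data, k=m)
        update_generator(z)


# Stub callables that only record batch sizes, to show the call pattern.
log = {"d": [], "g": []}
train(n_epochs=3, m=4,
      bpm_data=[(0.1,)] * 10, ive_data=[(0.9,)] * 10,
      target_data=[(0.5, 0.7)] * 10,
      generate=lambda z: 0.5,
      update_discriminator=lambda real, fake: log["d"].append((len(real), len(fake))),
      update_generator=lambda z: log["g"].append(len(z)),
      rng=random.Random(0))
```

Keeping the sampling logic separate from the update functions makes it easy to swap in any neural network library for the actual gradient steps.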

TABLE III. Results of MAEs

                 Experiment 1    Experiment 2
Augmented BPM    0.17            0.14
Existing BPM     0.48            0.41
Synthetic IVE    0.47            0.47


B. Tests of The Performance of Augmented BPMs 

To show that the predictions obtained from the augmented
BPMs produced by the computational framework outperform
those obtained from the existing BPMs and the probabilities
acquired from the synthetic IVE dataset, we apply statistical
analysis to test for significant differences among the errors
measured between: 1) the performance targets and the existing
BPMs, 2) the performance targets and the synthetic IVE dataset,
and 3) the performance targets and the augmented BPMs.

The performance of the existing BPMs, the IVE, and the
augmented BPMs is investigated by using the absolute errors
defined in TABLE IV as measured values.

TABLE IV. Comparison of Performance of BPMs

Absolute error    Explanation
E1                |probability of switching the light on obtained from
                  the existing BPM - the performance target|
E2                |probability of switching the light on obtained from
                  the IVE - the performance target|
E3                |probability of switching the light on obtained from
                  the augmented BPM - the performance target|


To statistically test the significance of the performance of
the augmented BPMs for both experiments, hypotheses are defined
as follows.

To test the performance of the augmented BPMs against the
existing BPMs, hypothesis 1 is defined as follows:

H0: mean of E1 - mean of E3 = 0

H1: mean of E1 - mean of E3 > 0

To test the performance of the augmented BPMs against the
IVE, hypothesis 2 is defined as follows:

H0: mean of E2 - mean of E3 = 0

H1: mean of E2 - mean of E3 > 0

A one-tailed t-test ($\alpha = 0.05$) was applied to test for a
statistically significant difference between the performance of
the augmented BPMs and that of the existing BPMs as well as the
IVE. The results are shown in TABLE V.
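For readers who want to reproduce this kind of test, the sketch below computes a one-tailed t statistic for the hypothesis that one set of absolute errors has a larger mean than another. The paper does not state which form of t-test was used, so the paired design, the normal approximation to the t distribution (reasonable only for large samples), the function name, and the randomly generated toy errors are all assumptions.

```python
import math
import random
import statistics


def paired_one_tailed_t(errors_a, errors_b):
    """t statistic and approximate p-value for H1: mean(a) - mean(b) > 0.

    Assumes the two error lists are paired point by point and that the
    differences are not all identical (otherwise the variance is zero).
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    # Normal approximation to the t distribution, for large n only.
    p_value = 1.0 - statistics.NormalDist().cdf(t)
    return t, p_value


# Toy errors: 'existing' errors are larger on average than 'augmented' ones.
rng = random.Random(0)
e_existing = [abs(rng.gauss(0.45, 0.10)) for _ in range(200)]
e_augmented = [abs(rng.gauss(0.15, 0.10)) for _ in range(200)]
t_value, p_value = paired_one_tailed_t(e_existing, e_augmented)
reject_h0 = p_value < 0.05
```

A statistics package with an exact t distribution would be preferable for small samples; the approximation is used here only to stay within the standard library.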

TABLE V. Results of the Hypothesis Testing

                    Experiment 1                  Experiment 2
                    Hypothesis 1   Hypothesis 2   Hypothesis 1   Hypothesis 2
Absolute T-value    44.300         17.873         53.535         19.377
P-value             < 0.05         < 0.05         < 0.05         < 0.05
H0                  Reject         Reject         Reject         Reject

From TABLE V, the null hypotheses were rejected in all
cases. Based on the hypothesis testing, we conclude that
the probabilities of switching the light on estimated by the
augmented BPMs are significantly closer to the performance
targets than those estimated by the existing BPMs or the
(synthetic) IVE dataset. This shows a strong potential of
using the computational framework to enhance the performance of
BPMs and reduce the performance discrepancy between predictions
during design and operational buildings.

VI. Conclusion 

The paper presents a computational framework to reduce the
performance discrepancy between predictions during design
and the actual performance observed when a building is in
operation. GANs are used to learn a mixture model that allows
an appropriate combination of existing BPMs with knowledge of
occupant behaviors in response to contextual factors in new
designs as obtained from IVE experiments.

The results of the experiments show the promising potential of
the computational framework for reducing the performance
discrepancy. From the evidence in TABLE V, the augmented
BPMs from both experiments outperform the existing BPMs and
the IVEs.

In future work, uncertainties have to be considered to
improve the performance of the framework. Many factors may
contribute to uncertainties, such as the quality of the
IVE datasets, the existing BPMs, and the framework itself.
More IVE experiments need to be conducted to investigate
and improve the performance of IVEs in occupant behavior
studies and to enhance the accuracy of the framework. Furthermore,
the quality of IVE datasets may depend on many
elements such as cues, instruments, and occupants. Cues
may need to be studied to enhance the quality of IVE
datasets. Since the data of existing BPMs are obtained from
occupant behaviors in existing buildings, constraints
on the types and behaviors of occupants may need to be defined
to correspond to the occupants of a new design. The algorithm of
the framework may need to be further improved to increase the
accuracy of augmented BPMs.